Disfluencies in Switchboard
Abstract
Disfluencies (“um,” repeats, self-repairs) are prevalent in spontaneous speech, and are relevant to both human speech communication and speech processing by machine. Although disfluencies have commonly been viewed as ‘noisy’ events, results from a large descriptive study indicate that disfluencies show regularities in a number of dimensions [9]. This paper reports selected results on Switchboard and two comparison corpora of spontaneous speech. Results illustrate the systematic distribution of disfluencies, and highlight differences as well as universals across corpora and speakers.
Related resources
A sequential repetition model for improved disfluency detection
This paper proposes a new method for automatically detecting disfluencies in spontaneous speech – specifically, self-corrections – that explicitly models repetitions vs. other disfluencies. We show that, in a corpus of Supreme Court oral arguments, repetition disfluencies can be longer and more stutter-like than the short repetitions observed in the Switchboard corpus and suggest that they can be...
The Role of Disfluencies in Topic Classification of Human-Human Conversations
We investigate the impact of disfluencies on the task of classifying natural human-human conversations into topics. Disfluencies are distinctive to spoken language, and their effect on a number of spoken language understanding tasks, including spoken language classification, remains largely unknown. We use a subset of Switchboard-I annotated for disfluencies and topics, and investigate the effe...
Early Prosodic Manifestations of Disfluency
Theoretical models of speech production have hypothesized a relation between different types of disfluencies and the mechanisms responsible for them. Some disfluencies, such as filled pauses (e.g. ‘um’, ‘uh’) and repetitions (e.g. ‘the the’), are argued to arise from difficulty in planning, while cutoff disfluencies (e.g. ‘horiz-[ontal]’) are argued to arise from self-monitoring. This distinctio...
Prosodic parallelism as a cue to repetition and error correction disfluency
Complex disfluencies that involve the repetition or correction of words are frequent in conversational speech, with repetition disfluencies alone accounting for over 20% of disfluencies. These disfluencies generally do not lead to comprehension errors for human listeners. We propose that the frequent occurrence of parallel prosodic features in the reparandum (REP) and alteration (ALT) intervals...
Prosodic parallelism as a cue to repetition disfluency
Repetition disfluencies are among the most frequent types of disfluency in conversational speech, accounting for over 20% of disfluencies, yet they do not generally lead to comprehension errors for human listeners. We propose that parallel prosodic features in the REP and ALT intervals of the repetition disfluency provide strong perceptual cues that signal the repetition to the listener. We repo...